mod_oai: An Apache Module for Metadata Harvesting
Abstract
We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH is the de facto standard for metadata exchange in digital libraries and allows repositories to expose their contents in a structured, application-neutral format with semantics optimized for accurate incremental harvesting. mod_oai differs from other OAI-PMH implementations in that it optimizes the harvesting of web content by building OAI-PMH capability into the Apache server.
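As an illustration of the incremental harvesting mentioned above, the following minimal sketch builds an OAI-PMH `ListRecords` request restricted by a `from` date. The base URL and the helper name `build_harvest_url` are hypothetical and are not taken from mod_oai; only the `verb`, `metadataPrefix`, and `from` request arguments come from the OAI-PMH specification.

```python
from urllib.parse import urlencode


def build_harvest_url(base_url, from_date=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL.

    base_url is a hypothetical repository endpoint; from_date, if given,
    limits the response to records changed on or after that date.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Selective harvesting: only records modified since from_date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, used for illustration only.
url = build_harvest_url("http://www.example.org/modoai", from_date="2004-01-01")
print(url)
```

A harvester that remembers the date of its last run can pass it as `from_date` so that each later request returns only records changed since then.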
Similar Resources
Generic XML-based Framework for Metadata
We present a generic and flexible framework for building geoscientific metadata portals independent of content standards for metadata and protocols. Data can be harvested with commonly used protocols (e.g., Open Archives Initiative Protocol for Metadata Harvesting) and metadata standards like DIF or ISO 19115. The new Java-based portal software supports any XML encoding and makes metadata searc...
Agris on-line Papers in Economics and Informatics
The paper deals with the need for a systemic solution for providing metadata from local archives to central repositories, and with its subsequent implementation by the Department of Information Technologies, Faculty of Economics and Management, Czech University of Life Sciences in Prague, for the needs of the agrarian WWW AGRIS portal. The system supports the OAI-PMH (Open Archive Initiative – Pro...
Integrating Preservation Functions into the Apache Web Server
We will investigate the feasibility of “piggybacking” on the existing Internet infrastructure to facilitate digital preservation. Our focus is the web server itself, and whether it can be adapted to actively support preservation of web-accessible content. In particular, we explore the use of an Apache module to expose more of the available content and metadata than is currently found through cr...
An Approach to Log Management: Prototyping a Design of Agent for Log Harvesting
This document describes the state of development of agents. Agents capture logs from devices, then normalize, reduce, and catalog them using metadata. Once all these processes are done, they transmit the cataloged data to a warehouse server using Transportation Protocol. An agent also uses orchestration parameters to transmit modified logs to a data warehouse server. These parameters can be re...
Slug: A Semantic Web Crawler
This paper introduces “Slug”, a web crawler (or “Scutter”) designed for harvesting semantic web content. Implemented in Java using the Jena API, Slug provides a configurable, modular framework that allows a great degree of flexibility in configuring the retrieval, processing and storage of harvested content. The framework provides an RDF vocabulary for describing crawler configurations and colle...